294 PART 5 Looking for Relationships with Correlation and Regression

Adjusting for confounders

When designing a regression analysis, you first have to decide: Are you doing an
exploratory analysis or a hypothesis-driven analysis? In an exploratory analysis,
you do not have a prespecified hypothesis. Instead, your aim is to answer the
research question, “What group of covariates do I need to include as independent
variables in my regression to predict the outcome and get the best model fit?” In
this case, you need to select a set of candidate covariates and then come up with
modeling rules to decide which group of covariates produces the best-fitting
model. In each chapter on regression in this book, we provide methods of
comparing models using model-fit statistics. You would use those to choose the
final model for your exploratory analysis. Exploratory analyses are considered
descriptive studies, and are weak study designs (see Chapter 7).
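
To make the idea of choosing among groups of candidate covariates concrete, here is a minimal sketch in Python. It assumes ordinary least-squares regression and uses AIC as the model-fit statistic; both are choices made for this illustration, not the only options. The covariate names (x1, x2, x3), the toy numbers, and the helper functions solve and ols_aic are all hypothetical.

```python
import math
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols_aic(columns, y):
    """Fit y on an intercept plus the given columns; return AIC up to a constant."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])  # number of estimated coefficients, including the intercept
    xtx = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve(xtx, xty)
    rss = sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))
    return n * math.log(rss / n) + 2 * p

# Toy numbers invented for illustration only; they are not real study data.
candidates = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [3, 1, 4, 1, 5, 9, 2, 6],
    "x3": [2, 7, 1, 8, 2, 8, 1, 8],
}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

# Score every non-empty group of candidate covariates.
scores = {}
for r in range(1, len(candidates) + 1):
    for names in combinations(candidates, r):
        scores[names] = ols_aic([candidates[n] for n in names], y)

best = min(scores, key=scores.get)  # lower AIC means better fit under this rule
for names, aic in sorted(scores.items(), key=lambda kv: kv[1]):
    print(names, round(aic, 2))
print("Best-fitting group under this rule:", best)
```

In practice you would fit the models with your statistical software and use whichever model-fit statistic the relevant regression chapter recommends. The sketch only shows the bookkeeping: fit each candidate group, record the statistic, and apply the rule you wrote down.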

But if you collected your data based on a hypothesis, you are doing a
hypothesis-driven analysis. Epidemiologic studies require hypothesis-driven
analyses, where you have already selected your exposure and outcome, and now you
have to fit a regression model that predicts the outcome and includes your
exposure and confounders as covariates. You know that every model you run must
have the outcome as the dependent variable and the exposure as a covariate.
However, you may not know how to decide which confounders stay in the model.

Regardless of whether you are doing exploratory or hypothesis-driven modeling,
you need to make rules before you start modeling that describe how you will make
decisions during your modeling process and about your final model. For example,
you may make a rule that every covariate in your final model must have a p value
that is statistically significant at α = 0.05. You can make other stipulations
about the final model, or about the process of arriving at it. What is important
is that you make the modeling rules and write them down before you start
modeling.
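
One way to write the rules down before modeling is to record them in code that cannot be changed later. The following Python sketch is only an illustration: the ModelingRules class, the covariates_to_keep function, the covariate names, and the p values are all hypothetical, and exempting the exposure from the significance rule is an assumption that suits a hypothesis-driven analysis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the rules cannot be edited once created
class ModelingRules:
    """Rules fixed before any model is fit (hypothetical structure)."""
    alpha: float = 0.05                  # significance threshold from the text
    always_keep: tuple = ("exposure",)   # assumption: exposure stays in every model

def covariates_to_keep(p_values, rules):
    """Return the covariates that satisfy the rules, given a dict of name -> p value."""
    kept = []
    for name, p in p_values.items():
        if name in rules.always_keep or p < rules.alpha:
            kept.append(name)
    return kept

RULES = ModelingRules()

# Made-up p values for illustration only; real ones come from your fitted model.
example_p = {"exposure": 0.20, "age": 0.01, "smoking": 0.30}
print(covariates_to_keep(example_p, RULES))
```

Because the dataclass is frozen, an attempt to change alpha partway through modeling raises an error, which mirrors the point of committing to the rules in advance.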

You then need to choose a modeling approach, which is the approach you will use
to determine which candidate confounders stay in the model with the exposure and
which ones are removed. There are three common approaches in regression modeling
(although analysts often have their own customized approaches). These approaches
don’t have official names, but we will use terms that are commonly used: forward
stepwise, backward elimination, and stepwise selection.

» Forward stepwise: In this approach, you add one candidate confounder at a time
to the model, fitting a new model at each iteration. If the covariate does not
meet your rules to be kept in the model, it is removed and never considered
again. Imagine you were fitting a regression model with one exposure covariate
and eight candidate confounders. Suppose that you add the first covariate with
the exposure and it meets modeling rules, so you keep it. But when you add the second